Recent advances in neural rendering point to a future in which visual data are widely distributed by sharing NeRF model weights. However, while common visual data (images and videos) have standard approaches for embedding ownership or copyright information, either explicitly or subtly, the problem remains unexplored for the emerging NeRF format. We present StegaNeRF, a method for steganographic information embedding in NeRF renderings. We design an optimization framework that allows accurate extraction of hidden information from images rendered by NeRF while preserving their original visual quality. We experimentally evaluate our method under several potential deployment scenarios and further discuss the insights discovered through our analysis. StegaNeRF signifies an initial exploration into the novel problem of instilling customizable, imperceptible, and recoverable information into NeRF renderings with minimal impact on the rendered images. Project page: https://xggnet.github.io/StegaNeRF/.
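For intuition, a minimal PyTorch-style sketch of the kind of joint objective such a framework might use, balancing rendering fidelity against recoverability of the hidden bits; the `nerf` renderer, the `decoder`, and the weighting are assumptions for illustration, not StegaNeRF's actual formulation.

import torch.nn.functional as F

# Hypothetical components: `nerf(pose)` renders an image and `decoder` is a CNN
# that tries to recover the embedded bits from that rendering. This is only a
# generic two-term objective, not the paper's exact losses or architecture.
def steganographic_step(nerf, decoder, pose, gt_image, hidden_bits, optimizer, w_hide=0.1):
    optimizer.zero_grad()
    rendered = nerf(pose)                              # (3, H, W) rendering
    bit_logits = decoder(rendered.unsqueeze(0))        # predicted bit logits
    recon_loss = F.mse_loss(rendered, gt_image)        # preserve visual quality
    hide_loss = F.binary_cross_entropy_with_logits(    # keep the bits recoverable
        bit_logits, hidden_bits.unsqueeze(0))
    loss = recon_loss + w_hide * hide_loss
    loss.backward()
    optimizer.step()
    return loss.item()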
The ubiquity of camera-embedded devices and the advances in deep learning have stimulated various intelligent mobile video applications. These applications often demand on-device processing of video streams to deliver real-time, high-quality services, owing to privacy and robustness concerns. However, their performance is constrained by the raw video streams, which tend to be captured with the small-aperture cameras of ubiquitous mobile platforms in dim light. Despite extensive low-light video enhancement solutions, they are unfit for deployment on mobile devices due to their complex models and their ignorance of system dynamics such as energy budgets. In this paper, we propose AdaEnlight, an energy-aware low-light video stream enhancement system for mobile devices. It achieves real-time video enhancement with competitive visual quality while adapting its runtime behavior to the dynamic energy budgets imposed by the platform. We report extensive experiments on diverse datasets, scenarios, and platforms and demonstrate the superiority of AdaEnlight compared with state-of-the-art low-light image and video enhancement solutions.
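As a rough illustration of budget-driven runtime adaptation, the sketch below assumes the enhancer exposes a few quality/cost presets and picks the best one whose projected per-frame cost fits the remaining energy budget; the preset names and cost figures are invented and are not AdaEnlight's actual mechanism.

# Assumed presets, ordered from best quality to cheapest; energy costs are made up.
PRESETS = [
    {"name": "full",  "input_scale": 1.00, "est_mj_per_frame": 30.0},
    {"name": "light", "input_scale": 0.50, "est_mj_per_frame": 12.0},
    {"name": "skip",  "input_scale": None, "est_mj_per_frame": 1.0},
]

def pick_preset(remaining_mj, frames_left):
    """Pick the highest-quality preset whose projected cost fits the budget."""
    per_frame_budget = remaining_mj / max(frames_left, 1)
    for preset in PRESETS:
        if preset["est_mj_per_frame"] <= per_frame_budget:
            return preset
    return PRESETS[-1]  # fall back to pass-through when the budget is exhausted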
Arbitrary style transfer (AST) transfers arbitrary artistic styles onto content images. Despite the recent rapid progress, existing AST methods are either incapable or too slow to run at ultra-resolutions (e.g., 4K) with limited resources, which heavily hinders their further applications. In this paper, we tackle this dilemma by learning a straightforward and lightweight model, dubbed MicroAST. The key insight is to completely abandon the use of cumbersome pre-trained Deep Convolutional Neural Networks (e.g., VGG) at inference. Instead, we design two micro encoders (content and style encoders) and one micro decoder for style transfer. The content encoder aims at extracting the main structure of the content image. The style encoder, coupled with a modulator, encodes the style image into learnable dual-modulation signals that modulate both intermediate features and convolutional filters of the decoder, thus injecting more sophisticated and flexible style signals to guide the stylizations. In addition, to boost the ability of the style encoder to extract more distinct and representative style signals, we also introduce a new style signal contrastive loss in our model. Compared to the state of the art, our MicroAST not only produces visually superior results but also is 5-73 times smaller and 6-18 times faster, for the first time enabling super-fast (about 0.5 seconds) AST at 4K ultra-resolutions. Code is available at https://github.com/EndyWon/MicroAST.
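As a rough sketch of the dual-modulation idea (one signal modulating intermediate features, another modulating the decoder's convolution filters), assuming PyTorch and per-channel modulation signals; this is an illustrative reading, not MicroAST's actual layers.

import torch.nn as nn
import torch.nn.functional as F

class DualModulatedConv(nn.Module):
    """One decoder block whose kernels and features are both conditioned on style."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)

    def forward(self, x, feat_scale, feat_shift, filt_scale):
        # Filter modulation: rescale each output kernel with a per-channel signal.
        w = self.conv.weight * filt_scale.view(-1, 1, 1, 1)
        x = F.conv2d(x, w, self.conv.bias, padding=1)
        # Feature modulation: AdaIN-style scale and shift from the style signal.
        return x * feat_scale.view(1, -1, 1, 1) + feat_shift.view(1, -1, 1, 1)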
Recent studies have shown great success in universal style transfer, which transfers arbitrary visual styles to content images. However, existing approaches suffer from an aesthetic-unrealistic problem that introduces disharmonious patterns and evident artifacts, making the results easy to distinguish from real paintings. To address this limitation, we propose AesUST, a novel aesthetic-enhanced style transfer approach that can generate aesthetically more realistic and pleasing results for arbitrary styles. Specifically, our approach introduces an aesthetic discriminator to learn universal human-favored aesthetic features from a large corpus of artist-created paintings. The aesthetic features are then incorporated to enhance the style transfer process via a novel Aesthetic-aware Style Attention (AesSA) module. Such an AesSA module enables our AesUST to integrate style patterns efficiently and flexibly according to the global aesthetic channel distribution of the style image and the local semantic spatial distribution of the content image. Moreover, we also develop a new two-stage transfer training strategy with two aesthetic regularizations to train our model more effectively, further improving the stylization performance. Extensive experiments and user studies demonstrate that our approach synthesizes aesthetically more harmonious and realistic results than the state of the art, greatly narrowing the gap to real artist-created paintings. Our code is available at https://github.com/EndyWon/AesUST.
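At a high level, the aesthetic discriminator can be read as adding an adversarial term to the usual content and style objectives. The sketch below assumes a discriminator module trained on paintings and an arbitrary weight, and omits the AesSA module and the two-stage training entirely; it is not AesUST's actual design.

import torch
import torch.nn.functional as F

def generator_loss(discriminator, stylized, content_loss, style_loss, w_adv=1.0):
    # The generator is rewarded when the discriminator, trained on artist-created
    # paintings, judges the stylized output to be painting-like.
    logits = discriminator(stylized)
    adv_loss = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    return content_loss + style_loss + w_adv * adv_loss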
Online Gaussian processes (GPs), typically used to learn models from time-series data, are more flexible and robust than offline GPs. Both local and sparse approximations of GPs can efficiently learn complex models online. However, these methods assume that all signals are relatively accurate and that all data are available for learning without misleading data. Moreover, in practice, the online learning capability of GPs is limited for high-dimensional problems and long-term tasks. This paper proposes a sparse online GP (SOGP) with a forgetting mechanism that forgets distant model information at a specific rate. The proposed approach combines two general data-deletion schemes for the basis vector set of SOGP: a scheme based on position information and a scheme based on the oldest point. We apply our approach to learning the inverse dynamics of a collaborative robot with 7 degrees of freedom under a two-segment trajectory-tracking problem with task switching. Both simulations and experiments show that the proposed approach achieves better tracking accuracy and prediction smoothness compared with the two general data-deletion schemes.
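For illustration, a small sketch of the two basis-vector deletion rules being compared, for a budget-limited basis set; the "position-based" rule below (drop the point farthest from the current query) is one plausible reading of the description, not necessarily the authors' exact criterion.

import numpy as np

def delete_oldest(insert_times):
    """Oldest-point scheme: drop the basis point that entered the set earliest."""
    return int(np.argmin(insert_times))

def delete_by_position(basis_X, x_new):
    """Position-based scheme: drop the basis point farthest from the newest input,
    i.e. the one least relevant to the current operating region."""
    dists = np.linalg.norm(basis_X - x_new, axis=1)
    return int(np.argmax(dists))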
In this paper, we present the texture reformer, a fast and universal neural-based framework for interactive texture transfer with user-specified guidance. The challenges lie in three aspects: 1) the diversity of tasks, 2) the simplicity of guidance maps, and 3) the execution efficiency. To address these challenges, our key idea is to use a novel feed-forward multi-view and multi-stage synthesis procedure consisting of i) a global view structure alignment stage, ii) a local view texture refinement stage, and iii) an effect enhancement stage, which synthesizes high-quality results with coherent structures and fine texture details in a coarse-to-fine fashion. In addition, we also introduce a novel learning-free view-specific texture reformation (VSTR) operation with a new semantic map guidance strategy to achieve more accurate semantic-guided and structure-preserved texture transfer. Experimental results on a variety of application scenarios demonstrate the effectiveness and superiority of our framework. Compared with state-of-the-art interactive texture transfer algorithms, it not only achieves higher-quality results but, more remarkably, is also 2-5 orders of magnitude faster. Code is available at https://github.com/EndyWon/Texture-Reformer.
Many imaging problems can be formulated as mapping problems. A general mapping problem aims to obtain an optimal mapping that minimizes an energy functional subject to given constraints. Existing methods for solving mapping problems are often inefficient and can sometimes get trapped in local minima. An additional challenge arises when the optimal mapping is required to be diffeomorphic. In this work, we address the problem by proposing a deep learning framework based on the quasi-conformal (QC) Teichmuller theory. The main strategy is to learn the Beltrami coefficient (BC) that represents a mapping as the latent feature vector in a deep neural network. The BC measures the local geometric distortion under the mapping, which enhances the interpretability of the deep neural network. Under this framework, the diffeomorphic property of the mapping can be controlled by a simple activation function in the network. The optimal mapping can also easily be regularized by integrating the BC into the loss function. A key advantage of the proposed framework is that, once the network is successfully trained, the optimized mapping corresponding to each input datum can be obtained in real time. To examine the efficacy of the proposed framework, we apply the method to the diffeomorphic image registration problem. Experimental results outperform other state-of-the-art registration algorithms in terms of both efficiency and accuracy, which demonstrates the effectiveness of our proposed framework for solving mapping problems.
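For reference, the standard quasi-conformal relations this framework builds on (textbook definitions, not the paper's specific loss): with a mapping f and complex coordinate z = x + iy,

\mu(f) = \frac{\partial f / \partial \bar{z}}{\partial f / \partial z},
\qquad
\frac{\partial f}{\partial \bar{z}} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\,\frac{\partial f}{\partial y}\right),
\quad
\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\,\frac{\partial f}{\partial y}\right),
\qquad
J_f = |f_z|^2 - |f_{\bar{z}}|^2 = |f_z|^2\bigl(1 - |\mu(f)|^2\bigr).

Wherever f_z \neq 0 and |\mu(f)| < 1, the Jacobian J_f is positive, so the mapping is locally an orientation-preserving diffeomorphism; this is why squashing the predicted BC to have modulus below 1 (e.g. via a bounded activation) controls the diffeomorphic property, and why penalizing |\mu| in the loss regularizes the geometric distortion.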
Context: Stack Overflow is very helpful for software developers seeking answers to programming questions. Previous studies have shown that a growing number of questions are of low quality and thus receive less attention from potential answerers. Gao et al. proposed an LSTM-based model (i.e., BiLSTM-CC) to automatically generate question titles from code snippets in order to improve question quality. However, using only the code snippet in the question body cannot provide sufficient information for title generation, and LSTMs cannot capture long-range dependencies between tokens. Objective: This paper proposes CCBERT, a novel deep-learning-based model that aims to enhance question title generation by making full use of the bi-modal information of the entire question body. Method: CCBERT follows the encoder-decoder paradigm and uses CodeBERT to encode the question body into hidden representations, a stacked Transformer decoder to generate predicted tokens, and an additional copy attention layer to refine the output distribution. Both the encoder and the decoder perform multi-head self-attention to better capture long-range dependencies. This paper constructs a dataset containing around 200,000 high-quality questions, filtered from the data officially released by Stack Overflow, to verify the effectiveness of the CCBERT model. Results: CCBERT outperforms all baseline models on the dataset. Experiments on the code-only and low-resource datasets show that CCBERT maintains its advantage, albeit to a smaller degree. Human evaluation also shows the excellent performance of CCBERT with respect to readability and relevance criteria.
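To make the arrangement concrete, a rough sketch assuming the public microsoft/codebert-base checkpoint and a plain torch Transformer decoder; the decoder size is arbitrary and the copy-attention layer is omitted, so this is not CCBERT's actual configuration.

import torch
import torch.nn as nn
from transformers import AutoModel

class TitleGenerator(nn.Module):
    """Encode the question body with CodeBERT, decode a title with a Transformer."""
    def __init__(self, vocab_size, d_model=768, n_layers=6, n_heads=12):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("microsoft/codebert-base")
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, n_layers)
        self.embed = nn.Embedding(vocab_size, d_model)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, body_ids, body_mask, title_ids):
        memory = self.encoder(input_ids=body_ids, attention_mask=body_mask).last_hidden_state
        tgt = self.embed(title_ids)
        t = title_ids.size(1)
        causal = torch.triu(torch.full((t, t), float("-inf"), device=title_ids.device), diagonal=1)
        hidden = self.decoder(tgt, memory, tgt_mask=causal)
        # Vocabulary logits; a copy-attention layer would further mix in a
        # pointer distribution over the body tokens.
        return self.out(hidden)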
Access to large and diverse computer-aided design (CAD) drawings is critical for developing symbol spotting algorithms. In this paper, we present FloorPlanCAD, a large-scale real-world CAD drawing dataset containing over 10,000 floor plans, ranging from residential to commercial buildings. All CAD drawings in the dataset are represented as vector graphics, which enables us to provide line-grained annotations of 30 object categories. Equipped with such annotations, we introduce the task of panoptic symbol spotting, which requires spotting both instances of countable things and the semantics of uncountable stuff. To tackle this task, we propose a novel method that combines Graph Convolutional Networks (GCNs) with Convolutional Neural Networks (CNNs), which captures both non-Euclidean and Euclidean features and can be trained end-to-end. The proposed CNN-GCN method achieves state-of-the-art (SOTA) performance on the task of semantic symbol spotting and helps us build a baseline network for the panoptic symbol spotting task. Our contributions are three-fold: 1) to the best of our knowledge, the presented CAD drawing dataset is the first of its kind; 2) the panoptic symbol spotting task considers the spotting of both thing instances and stuff semantics as one recognition problem; 3) we present a baseline solution to the panoptic symbol spotting task based on a novel CNN-GCN method, which achieves SOTA performance on semantic symbol spotting. We believe that these contributions will boost research in related fields.
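A toy sketch of the CNN + GCN combination described above: a CNN branch for Euclidean (rasterized) features and a graph-convolution branch over the vector-graphic primitives; the fusion strategy, feature dimensions, and adjacency normalization are assumptions, not the paper's actual architecture.

import torch
import torch.nn as nn

class GraphConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):            # x: (N, in_dim), adj: (N, N) normalized
        return torch.relu(self.lin(adj @ x))

class SymbolSpotter(nn.Module):
    def __init__(self, node_dim, n_classes):
        super().__init__()
        # Euclidean branch: a tiny CNN over the rasterized drawing.
        self.cnn = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten())
        # Non-Euclidean branch: graph convolutions over primitive nodes.
        self.gcn1 = GraphConv(node_dim + 32, 128)
        self.gcn2 = GraphConv(128, n_classes)

    def forward(self, image, node_feats, adj):
        ctx = self.cnn(image)                                          # (1, 32) global context
        x = torch.cat([node_feats, ctx.expand(node_feats.size(0), -1)], dim=1)
        return self.gcn2(self.gcn1(x, adj), adj)                       # per-primitive logits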
Learning efficient and interpretable policies has been a challenging task in reinforcement learning (RL), particularly in the visual RL setting with complex scenes. While neural networks have achieved competitive performance, the resulting policies are often over-parameterized black boxes that are difficult to interpret and deploy efficiently. More recent symbolic RL frameworks have shown that high-level domain-specific programming logic can be designed to handle both policy learning and symbolic planning. However, these approaches rely on coded primitives with little feature learning, and when applied to high-dimensional visual scenes, they can suffer from scalability issues and perform poorly when images have complex object interactions. To address these challenges, we propose Differentiable Symbolic Expression Search (DiffSES), a novel symbolic learning approach that discovers discrete symbolic policies using partially differentiable optimization. By using object-level abstractions instead of raw pixel-level inputs, DiffSES is able to leverage the simplicity and scalability advantages of symbolic expressions, while also incorporating the strengths of neural networks for feature learning and optimization. Our experiments demonstrate that DiffSES is able to generate symbolic policies that are simpler and more scalable than state-of-the-art symbolic RL methods, with a reduced amount of symbolic prior knowledge.
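Just to illustrate what an object-level symbolic policy can look like once discovered (as opposed to a pixel-level network), here is a toy, hand-written example; the object attributes, kinds, and actions are all invented, and the differentiable search procedure itself is not shown.

def symbolic_policy(objects):
    """objects: list of dicts with detected object attributes, e.g. x, y, kind."""
    agent = next(o for o in objects if o["kind"] == "agent")
    threats = [o for o in objects if o["kind"] == "enemy"]
    if not threats:
        return "noop"
    nearest = min(threats, key=lambda o: abs(o["x"] - agent["x"]))
    # A compact, human-readable rule instead of an opaque pixel-level network.
    return "move_left" if nearest["x"] > agent["x"] else "move_right"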